feat: quality improvements — multi-model routing, incremental reviews, structured logging#35

Merged
haasonsaas merged 2 commits into main from feat/quality-improvements on Mar 11, 2026

Conversation


haasonsaas (Collaborator) commented Mar 11, 2026

Summary

  • Fix panic-prone unwrap() in production paths (LLM adapter model parsing, verification regex captures)
  • 82 new tests across API (36), database (27), state (2), config (4), and incremental SHA tracking (2)
  • GitHub suggestion blocks with multi-line support (start_line/line) for one-click PR fixes
  • Incremental review (#16): push-by-push delta using last_reviewed_shas tracking and the GitHub compare API, with graceful fallback to a full diff
  • Multi-model routing (#26): ModelRole::Fast cascades model_fast → model_weak → primary model for lightweight tasks (commit messages, PR summaries, titles)
  • Wire up dead code: InteractiveCommand → webhook handler, storage methods → API endpoints, model_name() → pipeline logging, detect_context_window → doctor command
  • Structured logging: all eprintln!/println! replaced with tracing::error!/warn!/info! in server modules
  • Remove unused code: Plugin trait, legacy Ollama types, inline OllamaShowRequest/Response
  • New API endpoints: DELETE /api/review/{id}, POST /api/reviews/prune

Closes #8, closes #16, closes #26

Test plan

  • cargo clippy -- -D warnings passes clean
  • cargo test passes — 859 tests (up from 777), 0 failures
  • Verify incremental review flow: open PR → push → verify compare API diff is used
  • Verify model_fast config cascades correctly in production
  • Test @diffscope commands in PR comments trigger webhook handler

🤖 Generated with Claude Code



…ws, structured logging, tests

- Fix panic-prone unwrap() calls in production paths (adapters, verification)
- Add 82 new tests: API layer (36), database layer (27), state (2), config (4), incremental SHA (2), and more
- GitHub suggestion blocks: multi-line support with start_line/line for one-click PR fixes
- Incremental review: push-by-push delta using last_reviewed_shas and compare API
- Multi-model routing: ModelRole::Fast cascades model_fast → model_weak → primary
- Wire up dead code: InteractiveCommand → webhook, storage → API, model_name → pipeline, detect_context_window → doctor
- Remove unused Plugin trait, legacy Ollama types
- Replace all eprintln!/println! with tracing::error!/warn!/info! in server modules
- Add delete_review and prune_reviews API endpoints
- Reduce cloning by inlining OllamaShowRequest/Response

Closes #8, #16, #26

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration (bot) left a comment


Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.


Comment on lines +590 to +591
let limit = (per_page * 2) as i64; // fetch extra to merge
if let Ok(stored) = state.storage.list_reviews(limit, 0).await {


🔴 list_reviews storage query fetches too few rows, breaking pagination for pages beyond page 1

The new list_reviews implementation fetches at most per_page * 2 reviews from the storage backend with a fixed offset of 0 (src/server/api.rs:590-591). For any page beyond the first, this is insufficient. For example, if there are 0 in-memory reviews and 100 reviews in storage, requesting page 3 with per_page=20 (items 41-60) will return an empty result: the storage query only returns the first 40 items (20*2), so after sorting and slicing at index 40, nothing is available. The correct approach would be to fetch at least page * per_page items from storage (or all items up to a reasonable cap) to ensure later pages are populated.

Suggested change
-let limit = (per_page * 2) as i64; // fetch extra to merge
+let limit = (page * per_page).max(per_page * 2) as i64; // fetch enough to cover requested page
 if let Ok(stored) = state.storage.list_reviews(limit, 0).await {
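The arithmetic behind the suggested fix can be sketched as a standalone helper (the function name is illustrative, not the PR's actual code):

```rust
// Illustrative sketch of the corrected storage fetch limit: to serve a
// 1-based `page`, the backend must return at least page * per_page rows;
// keeping the old 2x floor preserves headroom for merging with
// in-memory reviews.
fn storage_fetch_limit(page: u32, per_page: u32) -> i64 {
    ((page * per_page).max(per_page * 2)) as i64
}

fn main() {
    // The old code always fetched per_page * 2 = 40 rows, so page 3
    // (items 41-60) sliced past the end of the result set.
    assert_eq!(storage_fetch_limit(3, 20), 60);
    assert_eq!(storage_fetch_limit(1, 20), 40);
}
```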

Comment on lines +640 to +641
let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
let max_count = params.max_count.unwrap_or(1000);


🔴 Negative max_age_days in prune_reviews can delete all reviews

The prune_reviews endpoint accepts user-supplied max_age_days as Option<i64> and multiplies it directly by 86400 to get max_age_secs (src/server/api.rs:640). A negative value (e.g. max_age_days=-1 → max_age_secs=-86400) means every review's age exceeds the threshold, causing all reviews to be pruned. This is a data-loss risk since the endpoint has no authentication and accepts arbitrary query parameters.

Suggested change
-let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
-let max_count = params.max_count.unwrap_or(1000);
+let max_age_secs = params.max_age_days.unwrap_or(30).max(1) * 86400;
+let max_count = params.max_count.unwrap_or(1000).max(1);
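The clamping can be exercised in isolation with a small sketch (the function is hypothetical, extracted from the handler for testability):

```rust
// Illustrative sketch of the clamped prune parameters: negative or zero
// user-supplied values are raised to 1, so a hostile query string cannot
// make every review look "too old" and wipe the table.
fn prune_params(max_age_days: Option<i64>, max_count: Option<i64>) -> (i64, i64) {
    let max_age_secs = max_age_days.unwrap_or(30).max(1) * 86_400;
    let max_count = max_count.unwrap_or(1000).max(1);
    (max_age_secs, max_count)
}

fn main() {
    // ?max_age_days=-1 no longer yields a negative age threshold.
    assert_eq!(prune_params(Some(-1), None), (86_400, 1000));
    // A zero max_count is clamped instead of deleting everything.
    assert_eq!(prune_params(None, Some(0)), (30 * 86_400, 1));
}
```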

Comment on lines +636 to +653
pub async fn prune_reviews(
State(state): State<Arc<AppState>>,
Query(params): Query<PruneParams>,
) -> Json<serde_json::Value> {
let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
let max_count = params.max_count.unwrap_or(1000);

match state.storage.prune(max_age_secs, max_count).await {
Ok(pruned) => {
info!("Pruned {} old reviews", pruned);
Json(serde_json::json!({ "ok": true, "pruned": pruned }))
}
Err(e) => {
warn!("Prune failed: {}", e);
Json(serde_json::json!({ "ok": false, "error": e.to_string() }))
}
}
}


🟡 prune_reviews endpoint only prunes storage backend, leaving in-memory state inconsistent

The new prune_reviews endpoint at src/server/api.rs:636-653 calls state.storage.prune(...) but does not prune the in-memory state.reviews HashMap. After a prune, the list_reviews endpoint will still return pruned reviews from the in-memory portion (added at src/server/api.rs:584-587), and get_review will also find them in memory (checked first at src/server/api.rs:563-567). The in-memory and storage views become inconsistent until the server restarts or in-memory reviews are naturally evicted by AppState::prune_old_reviews.

Prompt for agents
In src/server/api.rs, the prune_reviews function (lines 636-653) should also prune reviews from the in-memory state.reviews HashMap to stay consistent with what was pruned from storage. After the storage.prune() call succeeds, add logic to also remove matching reviews from state.reviews (e.g., reviews older than max_age_secs and over max_count). You could reuse or extend AppState::prune_old_reviews from src/server/state.rs:379 to accept the same parameters.
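One way to keep the in-memory map consistent is to apply the same policy locally after storage.prune() succeeds. A minimal sketch, assuming a simplified id-to-timestamp map rather than DiffScope's actual types:

```rust
use std::collections::HashMap;

// Hypothetical sketch: prune the in-memory map with the same policy as
// the storage backend — drop reviews older than the age cutoff, then
// trim the oldest entries until at most max_count remain.
fn prune_in_memory(
    reviews: &mut HashMap<String, u64>, // id -> created_at (unix secs)
    now: u64,
    max_age_secs: u64,
    max_count: usize,
) {
    let cutoff = now.saturating_sub(max_age_secs);
    reviews.retain(|_, created| *created >= cutoff);
    if reviews.len() > max_count {
        let mut by_age: Vec<(String, u64)> =
            reviews.iter().map(|(k, v)| (k.clone(), *v)).collect();
        by_age.sort_by_key(|(_, created)| *created); // oldest first
        let excess = reviews.len() - max_count;
        for (id, _) in by_age.iter().take(excess) {
            reviews.remove(id);
        }
    }
}

fn main() {
    let mut m = HashMap::from([
        ("old".to_string(), 100u64),
        ("mid".to_string(), 500),
        ("new".to_string(), 900),
    ]);
    prune_in_memory(&mut m, 1000, 600, 1); // age cutoff 400, then cap at 1
    assert!(!m.contains_key("old")); // dropped by age
    assert!(!m.contains_key("mid")); // dropped by count cap
    assert!(m.contains_key("new"));
}
```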

…apter

- Fix list_reviews pagination: fetch enough rows from storage to cover
  the requested page, not just 2x per_page
- Fix prune_reviews: also prune in-memory state, clamp max_age_days
  and max_count to >= 1 to prevent negative values deleting all reviews
- Avoid redundant adapter allocation in smart_review when fast model
  matches primary model — reuse existing adapter reference
- Run cargo fmt to fix CI formatting failure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

devin-ai-integration (bot) left a comment


Devin Review found 4 potential issues.

View 6 additional findings in Devin Review.



🟡 Context window detection lost model_info fallback, causing missed detection for many Ollama models

The old detect_model_context_window in src/commands/doctor.rs had two fallback strategies for detecting context window size: (1) parsing num_ctx from the parameters string, and (2) checking model_info for context_length, llama.context_length, and general.context_length. The replacement OfflineModelManager::detect_context_window (src/core/offline.rs:284-304) only implements strategy (1) via parse_num_ctx_from_params. Many Ollama models expose their context window through model_info fields rather than the parameters string, so these models will now show no context window information in diffscope doctor.

(Refers to lines 300-304)
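The two-strategy detection the finding describes can be sketched as follows. The helper name and key list mirror the review comment; the data shapes (a plain string and a flat map) are simplifications of Ollama's actual /api/show response:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the old doctor's two fallback strategies:
// (1) parse num_ctx from the `parameters` string, then
// (2) check model_info for the context-length keys Ollama commonly sets.
fn detect_context_window(
    parameters: &str,
    model_info: &HashMap<String, u64>,
) -> Option<u64> {
    // Strategy 1: a "num_ctx 8192" line in the parameters string.
    for line in parameters.lines() {
        let mut parts = line.split_whitespace();
        if parts.next() == Some("num_ctx") {
            if let Some(n) = parts.next().and_then(|v| v.parse().ok()) {
                return Some(n);
            }
        }
    }
    // Strategy 2: context-length fields in model_info.
    for key in ["context_length", "llama.context_length", "general.context_length"] {
        if let Some(&n) = model_info.get(key) {
            return Some(n);
        }
    }
    None
}

fn main() {
    let info = HashMap::from([("llama.context_length".to_string(), 131072u64)]);
    // No num_ctx in parameters, but model_info still yields a value —
    // this is the case the replacement implementation misses.
    assert_eq!(detect_context_window("temperature 0.7", &info), Some(131072));
    assert_eq!(detect_context_window("num_ctx 8192", &info), Some(8192));
}
```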


drop(config);

match crate::adapters::llm::create_adapter(&model_config) {
Ok(adapter) => match cmd.execute(adapter.as_ref(), None).await {


🔴 issue_comment webhook handler passes None diff to review command, making @diffscope review non-functional

The new issue_comment webhook handler at src/server/github.rs:504 calls cmd.execute(adapter.as_ref(), None), always passing None for diff_content. When the command is @diffscope review, execute_review (src/core/interactive.rs:79) checks if let Some(diff) = diff_content and returns "No diff content available for review." when it's None. Since the handler never fetches the PR diff from GitHub before executing, the review command is entirely non-functional via webhook comments — it will always respond with that error message.

Prompt for agents
In src/server/github.rs, inside the issue_comment handler (around line 493-511), before executing LLM-based commands like Review or Explain, fetch the PR diff from GitHub. The issue payload contains issue.pull_request.url if the issue is a PR. Use this to detect if it's a PR, fetch the diff via the GitHub API (similar to how the pull_request handler does it via fetch_full_pr_diff), and pass it to cmd.execute(). For non-PR issues or if the diff fetch fails, pass None as a fallback. Also, you'll need to resolve the auth_token properly (see BUG-0002) to be able to fetch the diff.

Comment on lines +514 to +527
if let Some(ref auth) = token {
let comment_url = format!(
"https://api.github.com/repos/{}/issues/{}/comments",
repo, issue_number
);
let _ = state
.http_client
.post(&comment_url)
.header("Authorization", format!("Bearer {}", auth))
.header("User-Agent", "DiffScope")
.json(&serde_json::json!({ "body": response_body }))
.send()
.await;
}


🔴 issue_comment webhook handler ignores GitHub App auth, so responses are never posted for App deployments

In the pull_request handler, when a GitHub App is configured (with github_app_id and github_private_key), the code obtains an installation token via get_installation_token(). However, the new issue_comment handler at src/server/github.rs:514 only checks if let Some(ref auth) = token, where token is config.github_token (a PAT). For GitHub App deployments — which CLAUDE.md mandates as the recommended auth method — github_token is typically None, so the response comment will silently never be posted. The handler should obtain an installation token from the webhook payload's installation.id, just like the pull_request handler does.

Prompt for agents
In src/server/github.rs, inside the issue_comment handler (around line 467-536), add GitHub App installation token resolution similar to the pull_request handler. Extract the installation_id from payload["installation"]["id"], and if github_app_id and private_key are available, call get_installation_token() to get an auth token. Use that token (falling back to the PAT token) when posting the response comment at line 514. The auth resolution logic can be extracted into a shared helper to avoid duplication with the pull_request handler.
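The precedence the finding asks for reduces to a one-line rule once the (async) installation-token minting is done. A minimal sketch of just that precedence, with illustrative names:

```rust
// Hypothetical sketch of the auth-resolution order the review suggests:
// prefer a GitHub App installation token (minted from the payload's
// installation.id) and fall back to the PAT from config. Token minting
// itself is elided; this only shows the precedence.
fn resolve_auth_token(
    installation_token: Option<String>,
    pat: Option<String>,
) -> Option<String> {
    installation_token.or(pat)
}

fn main() {
    // App deployment: the installation token wins even if a PAT exists.
    assert_eq!(
        resolve_auth_token(Some("ghs_app".into()), Some("ghp_pat".into())),
        Some("ghs_app".to_string())
    );
    // PAT-only deployment can still post comments.
    assert_eq!(
        resolve_auth_token(None, Some("ghp_pat".into())),
        Some("ghp_pat".to_string())
    );
}
```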

Comment on lines +926 to +930
ModelRole::Fast => self
.model_fast
.as_deref()
.or(self.model_weak.as_deref())
.unwrap_or(&self.model),


🟡 Fast model role defaults to smaller models, violating CLAUDE.md convention

CLAUDE.md states: "Use frontier models for reviews — never default to smaller models." The new ModelRole::Fast fallback chain at src/config.rs:926-930 is model_fast -> model_weak -> primary. When model_fast is not set but model_weak is configured (e.g., claude-haiku), the Fast role silently defaults to the smaller/cheaper Weak model for PR summaries, commit messages, PR titles, and diagram generation. These are all part of the review pipeline. While the core code analysis still uses the frontier model, this fallback chain means the review pipeline as a whole can default to smaller models without the user explicitly opting in via model_fast.

Suggested change
-ModelRole::Fast => self
-    .model_fast
-    .as_deref()
-    .or(self.model_weak.as_deref())
-    .unwrap_or(&self.model),
+ModelRole::Fast => self
+    .model_fast
+    .as_deref()
+    .unwrap_or(&self.model),
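The behavioral difference between the two chains can be isolated in a small sketch (function names are illustrative; the real code matches on ModelRole inside the config):

```rust
// Illustrative comparison of the two fallback chains for ModelRole::Fast.
// `merged` is the chain the PR shipped; `suggested` is the reviewer's
// stricter chain that never falls through to model_weak.
fn fast_model_merged<'a>(
    model_fast: Option<&'a str>,
    model_weak: Option<&'a str>,
    primary: &'a str,
) -> &'a str {
    model_fast.or(model_weak).unwrap_or(primary)
}

fn fast_model_suggested<'a>(model_fast: Option<&'a str>, primary: &'a str) -> &'a str {
    model_fast.unwrap_or(primary)
}

fn main() {
    // With only model_weak configured, the merged chain silently picks
    // the smaller model; the suggested chain stays on the primary.
    assert_eq!(
        fast_model_merged(None, Some("claude-haiku"), "claude-opus"),
        "claude-haiku"
    );
    assert_eq!(fast_model_suggested(None, "claude-opus"), "claude-opus");
}
```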

haasonsaas merged commit c8af0ed into main on Mar 11, 2026
6 checks passed
haasonsaas deleted the feat/quality-improvements branch on March 11, 2026 at 19:52